Estimating the Confidence Interval for Prediction Errors of Support Vector Machine Classifiers
Abstract
The support vector machine (SVM) is one of the most popular and promising classification algorithms. After a classification rule is constructed via the SVM, it is essential to evaluate its prediction accuracy. In this paper, we develop procedures for obtaining both point and interval estimators for the prediction error. Under mild regularity conditions, we derive the consistency and asymptotic normality of the prediction error estimators for SVM with finite-dimensional kernels. A perturbation-resampling procedure is proposed to obtain interval estimates for the prediction error in practice. Based on numerical studies with simulated data and a benchmark repository, we recommend the use of interval estimates centered at the cross-validated point estimates for the prediction error. Further applications of the proposed procedure in model evaluation and feature selection are illustrated with two examples.
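The recommendation in the abstract (an interval centered at the cross-validated point estimate, with its width obtained by perturbation resampling) can be illustrated with a small sketch. This is not the paper's exact procedure. It assumes a linear SVM fitted by plain subgradient descent, Exp(1) perturbation weights (mean 1, variance 1), a normal-approximation interval, and simulated two-class Gaussian data. All function names and tuning values (`lam`, `lr`, `epochs`, `n_resamples`) are hypothetical choices made for illustration.

```python
import random
import statistics


def fit_linear_svm(X, y, weights, lam=0.01, lr=0.1, epochs=100):
    """Weighted linear SVM (hinge loss + L2 penalty), full-batch subgradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi, gi in zip(X, y, weights):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # only margin violators contribute to the subgradient
                for j in range(d):
                    grad_w[j] -= gi * yi * xi[j] / n
                grad_b -= gi * yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def weighted_error(model, X, y, weights):
    """Weighted misclassification rate of a fitted (w, b) on (X, y)."""
    w, b = model
    wrong = 0.0
    for xi, yi, gi in zip(X, y, weights):
        score = sum(wj * xj for wj, xj in zip(w, xi)) + b
        pred = 1 if score >= 0 else -1
        if pred != yi:
            wrong += gi
    return wrong / sum(weights)


def cv_error(X, y, k=5):
    """k-fold cross-validated misclassification rate (the point estimate)."""
    n = len(X)
    mistakes = 0.0
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        model = fit_linear_svm([X[i] for i in train], [y[i] for i in train],
                               [1.0] * len(train))
        fold_err = weighted_error(model, [X[i] for i in test],
                                  [y[i] for i in test], [1.0] * len(test))
        mistakes += fold_err * len(test)
    return mistakes / n


def perturbation_interval(X, y, n_resamples=50, z=1.96, seed=0):
    """Interval centered at the CV estimate; width from perturbation resampling."""
    rng = random.Random(seed)
    point = cv_error(X, y)
    perturbed = []
    for _ in range(n_resamples):
        # Assumed perturbation scheme: i.i.d. Exp(1) weights, one per observation.
        g = [rng.expovariate(1.0) for _ in X]
        model = fit_linear_svm(X, y, g)  # refit under the perturbed objective
        perturbed.append(weighted_error(model, X, y, g))
    se = statistics.stdev(perturbed)  # spread of the perturbed error estimates
    return point, max(0.0, point - z * se), min(1.0, point + z * se)


def simulate(n=100, seed=1):
    """Two overlapping Gaussian classes with labels in {-1, +1} (toy data)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        X.append([rng.gauss(label * 1.0, 1.0), rng.gauss(0.0, 1.0)])
        y.append(label)
    return X, y


X, y = simulate()
point, lower, upper = perturbation_interval(X, y)
print(f"CV error {point:.3f}, approx. 95% interval [{lower:.3f}, {upper:.3f}]")
```

The sketch uses the standard deviation of the perturbed apparent errors as a standard error. A fuller treatment would follow the paper's asymptotic results and could use quantiles of the perturbed estimates instead of the normal approximation.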
Similar resources
Application of ensemble learning techniques to model the atmospheric concentration of SO2
In view of pollution prediction modeling, the study adopts homogeneous (random forest, bagging, and additive regression) and heterogeneous (voting) ensemble classifiers to predict the atmospheric concentration of sulphur dioxide. For model validation, results were compared against widely known single base classifiers such as support vector machine, multilayer perceptron, linear regression and re...
Support Vector Machine Based Facies Classification Using Seismic Attributes in an Oil Field of Iran
Seismic facies analysis (SFA) aims to classify similar seismic traces based on amplitude, phase, frequency, and other seismic attributes. SFA has proven useful in interpreting seismic data, allowing significant information on subsurface geological structures to be extracted. While facies analysis has been widely investigated through unsupervised-classification-based studies, there are few cases...
The Porosity Prediction of One of Iran South Oil Field Carbonate Reservoirs Using Support Vector Regression
Porosity is considered an important petrophysical parameter in characterizing reservoirs, calculating in-situ oil reserves, and evaluating production. Nowadays, intelligent techniques have become a popular approach to porosity estimation. Support vector machine (SVM), a new intelligent method with great generalization potential for modeling non-linear relationships, has been introduced fo...
Automatic classification of highly related Malate Dehydrogenase and L-Lactate Dehydrogenase based on 3D-pattern of active sites
Accurate protein function prediction is an important subject in bioinformatics, especially where sequentially and structurally similar proteins have different functions. Malate dehydrogenase and L-lactate dehydrogenase are two evolutionarily related enzymes, which exist in a wide variety of organisms. These enzymes are sequentially and structurally similar and share common active site residues, spati...
The Application of Least Square Support Vector Machine as a Mathematical Algorithm for Diagnosing Drilling Effectivity in Shaly Formations
The problem of slow drilling in deep shale formations occurs worldwide, causing significant expenses to the oil industry. Bit balling, which is widely considered as the main cause of poor bit performance in shales, especially deep shales, is being drilled with water-based mud. Therefore, efforts have been made to develop a model to diagnose drilling effectivity. Hence, we arrived at graphical cor...
Journal title:
- Journal of Machine Learning Research
Volume 9
Publication date: 2008